Linear-scaling ab-initio calculations for large and complex systems 
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A brief review of the Siesta project is presented in the context of linear-scaling density-functional 
methods for electronic-structure calculations and molecular-dynamics simulations of systems with 
a large number of atoms. Applications of the method to different systems are reviewed, including 
carbon nanotubes, gold nanostructures, adsorbates on silicon surfaces, and nucleic acids. Also, 
progress in atomic-orbital bases adapted to linear-scaling methodology is presented. 
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I. INTRODUCTION 



It is clearer every day the contribution that first- 
principles calculations are making to several fields in 
physics, chemistry, and recently geology and biology. The 
steady increase in computer power and the progress in 
methodology have allowed the study of increasingly more 
complex and larger systems |l| . It has been only recently 
that the scaling of the computation expense with the sys- 
tem size has become an important issue in the field. Even 
efficient methods, like those based on Density-Functional 
theory (DFT), scale like A^ 2-3 , being A~ the number of 
atoms in the simulation cell. Q This problem stimulated 
the the first ideas for methods which scale linearly with 
system size a field that has been the subject of im- 
portant efforts ever since (|). 

The key for achieving linear scaling is the explicit use 
of locality, meaning by it the insensitivity of the proper- 
ties of a region of the system to perturbations sufficiently 
far away from it || . A local language will thus be needed 
for the two different problems one has to deal with in a 
DFT-like method: building the self-consistent Hamilto- 
nian, and solving it. Most of the initial effort was dedi- 
cated to the latter using empirical or semi-empirical 
Hamiltonians. The Siesta project started in 1995 
to address the former. Atomic-orbital basis sets were 
chosen as the local language, allowing for arbitrary basis 
sizes, what resulted in a general-purpose, flexible linear- 
scaling DFT program |7],|| . A parallel effort has been the 
search for orbital bases that would meet the standards 
of precision of conventional first-principles calculations, 
but keeping as small a range as possible for maximum 
efficiency. Several techniques are presented here. 

Other approaches pursued by other groups are also 
shortly reviewed in section II. All of them are based on 
local bases with different flavors, offering a fair variety of 
choice between systematicity and efficiency. Our devel- 
opments of atomic bases for linear-scaling are presented 
in section III. Siesta has been applied to quite varied 



systems during these years, ranging from metal nanos- 
tructures to biomolecules. Some of the results obtained 
are briefly reviewed in section IV. 



II. METHOD AND CONTEXT 

Siesta is based on DFT, using local-density Q and 
generalized-gradients functionals 0, including spin po- 



larization, collinear and non-collinear |10|. Core electrons 
are replaced by norm- conserving pseudopotentials |Tl] ] 
factorized in the Kleinman-Bylander form Jl2| , including 
scalar-relativistic effects, and non-linear partial-core cor- 
rections . The one-particle problem is then solved us- 
ing linear combination of atomic orbitals (LCAO). There 
are no constraints either on the radial shape of these or- 
bitals (numerically treated), or on the size of the basis, 
allowing for the full quantum-chemistry know-how H ] 
(multiple-^, polarization, off-site, contracted, and diffuse 
orbitals). Forces on the atoms and the stress tensor are 
obtained from the Hcllmann-Feynman theorem with Pu- 
lay corrections Q , and are used for structure relaxations 
or molecular dynamics simulations of different types. 

Firstly, given a Hamiltonian, the one-particle 
Schrodinger equation is solved yielding the energy and 
density matrix for the ground state. This task is per- 
formed either by diagonalization (cube-scaling, appropri- 
ate for systems under a hundred atoms or for metals) or 
with a linear-scaling algorithm. These have been exten- 
sively reviewed elsewhere ||. Siesta implements two 
O(N) algorithms PJl5|] based on localized Wannier-like 
wavef unctions. 

Secondly, given the density matrix, a new Hamiltonian 
matrix is obtained. There are different ways proposed 
in the literature to perform this calculation in order-A^ 
operations. 

(i) Quantum chemists have explored algorithms for 
Gaussian-type orbitals (GTO) and related technology 
0. The Ion g-range Hartree potential posed an impor- 
tant problem that has been overcome with Fast Mul- 
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tipole Expansion techniques plus near-field corrections 
p6| . Within this approach, periodic boundary conditions 
for extended systems require additional techniques that 
are under current development ]l7[ |. 

(ii) Among physicists tradition favors more systematic 
basis sets, such as plane-waves and variations thereof. 
Working directly on a real-space grid was early proposed 
as a natural possibility for linear scaling [p|. Multigrid 
techniques allow efficient treatment of the Hartree prob- 
lem, making it very attractive. However, a large prefactor 
was found jl8j for the linear scaling, making the order- N 
calculations along this line not so practical for the mo- 
ment. The introduction of a basis of localized functions 
on the points of the grid (blips) was then proposed as 
an operative method within the original spirit |f9f| . It is 
probably more expensive than LCAO alternatives, but 
with the advantage of a systematic basis. Another ap- 
proach pfj works with spherical Bessel functions confined 
to (overlapping) spheres wisely located within the simu- 
lation cell. As for plane-waves, a kinetic energy cutoff 
defines the quality of the basis within one sphere. The 
number, positioning, and radii of the spheres are new 
variables to consider, but the basis is still more system- 
atic than within LCAO. 

{Hi) There are mixed schemes that use atomic-orbital 
bases but evaluate the matrix elements using plane-wave 
or real-space-grid techniques. The method of Lippert 
et al. uses GTO's and associated techniques for the 
computation of the matrix elements of some terms of the 
Kohn-Sham Hamiltonian. It uses plane-wave representa- 
tions of the density for the calculation of the remaining 
terms. This latter method is conceptually very similar to 
the one presented earlier by Ordejon et al. J5|, on which 
Siesta is based. The matrix elements within Siesta are 
also calculated in two different ways [Q: some Hamilto- 
nian terms in a real-space grid and other terms (involv- 
ing two-center integration) by very efficient, direct LCAO 
integration |22]. While Siesta uses numerical orbitals, 
Lippert 's method works with GTOs, which allow analytic 
integrations, but require more orbitals. 

Except for the quantum-chemical approaches, the 
methods mentioned require smooth densities, and thus 
soft pseudopotentials. A recent augmentation proposal 
p3| allows a substantial improvement in grid convergence 
of the method of Lippert et al. , possibly allowing for 
all-electron calculations. 

III. ATOMIC ORBITALS ADAPTED TO LINEAR 
SCALING 

The main advantage of atomic orbitals is their effi- 
ciency (fewer orbitals needed per electron for similar pre- 
cision) and their main disadvantage is the lack of sys- 
tematics for optimal convergence, an issue that quantum 
chemists have been working on for many years pi . They 



have also clearly shown that there is no limitation on pre- 
cision intrinsic to LCAO. 

Orbital range. The need for locality in linear-scaling 
algorithms imposes a finite range for matrix elements, 
which has a strong influence on the efficiency of the 
method. There is a clear challenge ahead for finding 
short-range bases that still give a high precision. The 
traditional way is to neglect matrix elements between far- 
away orbitals with values below a tolerance. This proce- 
dure implies a departure from the original Hilbert space 
and it is numerically unstable for short ranges. Instead, 
the use of orbitals that would strictly vanish beyond a 
certain radius was proposed This gives sparse ma- 
trices consistently within the Hilbert space spanned by 
the basis, numerically robust even for small ranges. 

In the context of Siesta, the use of pseudopotentials 
imposes basis orbitals adapted to them. Pseudoatomic 
orbitals (PAOs) are used, i.e., the DFT solution of the 
atom with the pseudopotential. PAO's confined by a 
spherical infinite-potential wall 0], has been the start- 
ing point for our bases. Fig. 1 shows s and p confined 
PAOs for oxygen. Smoother confining potentials have 
been proposed as a better converging alternative 0] . 

A single parameter that defines the confinement radii 
of different orbitals is the orbital energy shift [ p5| , 
AEpACb i.e. , the energy increase that each orbital ex- 
periences when confined to a finite sphere. It defines all 
radii in a well balanced way and allows the systematic 
convergence of physical quantities to the required pre- 
cision. Fig. 2 shows the convergence of geometry and 
cohesive energy with A£?pao f° r various systems. It 
varies depending on the system and physical quantity, 
but A£pao ~ 100 — 200 meV gives typical precisions 
within the accuracy of current GGA functionals. 

Multiple-^. To generate confined multiple-C bases, a 
first proposal (|(| suggested the use of the excited PAOs 
in the confined atom. It works well for short ranges, 
but shows a poor convergence with AEpaoj since some 
of these orbitals are unbound in the free atom. In the 
split-valence scheme, widely used in quantum chemistry, 
GTOs that describe the tail of the atomic orbitals are left 
free as separate orbitals for the extended basis. Adding 
the quantum-chemistry JIJ] GTOs' tails to the PAO 
bases gives flexible bases, but the confinement control 
with AEpao is lost. The best scheme used in Siesta 
calculations so far is based on the idea (2?]] of adding, 
instead of a GTO, a numerical orbital that reproduces 
the tail of the PAO outside a radius Rbz, and contin- 
ues smoothly towards the origin as r l (a — br 2 ), with a 
and b ensuring continuity and differentiability at -Rdz- 
This radius is chosen so that the norm of the tail be- 
yond has a given value. Variational optimization of this 
split norm performed on different systems shows a very 
general and stable performance for values around 15% 
(except for the ~ 50% for hydrogen) . Within exactly the 
same Hilbert space, the second orbital can be chosen as 
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the difference between the smooth one and the original 
PAO, which gives a basis orbital strictly confined within 
the matching radius Ruz, i-e., smaller than the original 
PAO. This is illustrated in Fig. 1. Multiple-^ is obtained 
by repetition of this procedure. 

Polarization orbitals. A shell with angular momentum 
I + 1 (or more shells with higher I) is usually added to 
polarize the most extended atomic valence orbitals (I), 
giving angular freedom to the valence electrons. The 
(empty) I + 1 atomic orbitals are not necessarily a good 
choice, since they are typically too extended. The normal 
procedure within quantum chemistry |Q is using GTOs 
with maximum overlap with valence orbitals. Instead, 
we use for Siesta the numerical orbitals resulting from 
the actual polarization of the pseudoatom in the presence 
of a small electric field 25 1. The pseudoatomic problem 
is then exactly solved (within DFT), yielding the I + 1 
orbitals through comparison with first order perturbation 
theory. The range of the polarization orbitals is defined 
by the range of the orbitals they polarize. It is illustrated 
in Fig. 3 for the d orbitals of silicon. 

The performance of the schemes presented here has 
been tested for various applications (see below) and a sys- 
tematic study will be presented elsewhere Q . It has been 
found in general that double-C, singly polarized (DZP) 
bases give precisions within the accuracy of GGA func- 
tional for geometries, energetics and elastic/vibrational 
properties. 

Other possibilities. Scale factors on orbitals are also 
used, both for orbital contraction and for diffuse orbitals. 
Off-site orbitals can be introduced. They serve for the 
evaluation of basis-set superposition errors [28[| . Spheri- 
cal Bcssel functions are also included, that can be used 
for mixed bases between our approach and the one of 
Haynes and Payne pi . 



so, even for sizes for which very favorable geometric struc- 
tures had been proposed before. In a further study the 
origin of this striking situation is explained in terms of 
local stresses . Chains of gold atoms have been stud- 
ied addressing the experiments p4| which show them dis- 
playing remarkably long interatomic spacings (4 - 5 A) . A 
first study (3^] arrives at the conclusion that a linear gold 
chain would break at interatomic spacings much smaller 
than the observed ones. It is illustrated in Fig. 4 [Q. A 
possible explanation of the discrepancy is reported else- 
where. |36| ] 

Surfaces and Adsorption. A molecular dynamics sim- 
ulation was performed ]37[ ] on the clean surface of liquid 
silicon close to the melting temperature, in which surface 
layering was found, i.e., density oscillations of roughly 
atomic amplitude, like what was recently found to hap- 
pen in the surface of other liquid metals J38|. Unlike 
them, though, the origin for silicon was found to be ori- 
entational, reminescent of directed octahedral bonding. 
Adsorption studies have also been performed on solid sil- 
icon surfaces, Ba on Si(100) f§ and C 60 on Si(lll) go). 
Both works study adsorption geometries and energetics. 
For Ba, interactions among adsorbed atoms and diffu- 
sion features are studied. For Cgo, STM images have 
been simulated and compared to experiments. 

Nucleic Acids. Feasibility tests on DNA were per- 
formed in the early stages of the project, by relaxing 
a dry B-form poly(dC)-poly(dG) structure with a min- 
imal basis (10]. In preparation of realistic calculations, 
a thorough study |2£| of 30 nucleic acid pairs has been 
performed addressing the precision of the approximations 
and the DZP bases, and the accuracy of the GGA func- 
tional obtaining good results even for the hydrogen 
bridges. Based on that, a first study of dry A-DNA has 
been performed, with a full relaxation of the structure, 
and an analysis of the electronic characteristics Ell . 



IV. BRIEF REVIEW OF APPLICATIONS 

Carbon Nano structures. A preliminary version of 
Siesta was first applied to study the shape of large hol- 
low carbon fullerenes Q up to C540, the results contribut- 
ing to establish that they do not tend to a spherical-shape 
limit but tend to facet around the twelve corners given 
by the pentagons. Siesta has been also applied to car- 
bon nanotubes. In a first study, structural, elastic and 
vibrational properties were characterized [p9| . A second 
work was dedicated to their deposition on gold surfaces, 
and the STM images that they originate pC|j , specially 
addressing experiments on finite-length tubes. A third 
study has been dedicated to the opening of single-wall 
nanotubes with oxygen, and the stability of the open, 
oxidized tubes for intercalation studies |n| . 

Gold Nano structures. Gold nanoclusters of small sizes 
(up to AU75) were found [p2| to be amorphous, or nearly 



V. CONCLUSIONS 

The status of the Siesta project has been briefly re- 
viewed, putting it in context with other methods of liner- 
scaling DFT, and briefly describing results obtained with 
Siesta for a variety of systems. The efforts dedicated 
to finding schemes for atomic bases adapted to linear- 
scaling have been also described. A promising field still 
very open for future research. 
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FIG. 1. Confined pseudoatomic orbitals for oxygen, s in 
(a) and (b). p in (c) and (d). R c is the confinement radius 
obtained for ASpao = 250 meV. The original PAOs are rep- 
resented with thinner lines. The split smooth functions are 
plotted with thicker lines in (a) and (c), while the resulting 
double-^ orbitals are plotted with thicker lines in (b) and (d). 

FIG. 2. Convergence with energy shift ASpao of (a) lat- 
tice parameters of bulk Si (o), Au (*), and MgO (•), and bond 
length (A) and angle (x) of H2O; and (b) corresponding co- 
hesive (bond) energies. 
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FIG. 3. d polarization orbitals for silicon for two different 
confinement conditions, (a) Obtained with the electric-field 
polarization method, and (b) the confined d PAOs. 

FIG. 4. Cohesive energy (a), and stretching force (b) in a 
linear gold chain as a function of interatomic distance. Black 
dots are for the translationally invariant chain, white circles 
and squares are for supercells of 4 and 8 atoms, respectively, 
where the system is allowed to break. 
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